Speaker transformation using sentence HMM based alignments and detailed prosody modification

Authors

  • Levent M. Arslan
  • David Talkin
Abstract

This paper presents several improvements to our voice conversion system, which we refer to as the Speaker Transformation Algorithm using Segmental Codebooks (STASC) [2]. First, a new concept, the sentence HMM, is introduced for the alignment of speech waveforms sharing the same text. This alignment technique allows reliable and high-resolution mapping between two speech waveforms. In addition, it is observed that energy and speaking rate differences between two speakers are not constant across all phonemes. Therefore, a codebook-based duration and energy scaling algorithm is proposed. Finally, a more detailed pitch modification is introduced that takes into account pitch range differences between source and target speakers in addition to mean pitch level differences. The proposed changes made a significant impact on the quality of transformed speech. Subjective listening tests showed that intelligibility is maintained at the same level as natural speech after the speaker transformation.
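
The abstract does not give the pitch modification formula, so the following is only a minimal sketch of one common way to account for both mean pitch level and pitch range: a linear mapping that shifts the source mean to the target mean and scales deviations by the ratio of standard deviations. The function name `convert_pitch`, its parameters, and the use of standard deviation as the measure of pitch range are all assumptions made for illustration, not the authors' method.

```python
def convert_pitch(source_f0, src_mean, src_std, tgt_mean, tgt_std):
    """Map source F0 values (Hz) toward a target speaker's mean and range.

    Hypothetical helper: assumes pitch range is summarized by the
    standard deviation of voiced F0, which the abstract does not state.
    Frames with F0 <= 0 are treated as unvoiced and returned as 0.0.
    """
    # Ratio of target to source pitch spread (assumed range measure).
    scale = tgt_std / src_std
    converted = []
    for f0 in source_f0:
        if f0 <= 0:
            # Unvoiced frame: no pitch to modify.
            converted.append(0.0)
        else:
            # Shift to the target mean, scale deviation by the range ratio.
            converted.append(tgt_mean + scale * (f0 - src_mean))
    return converted


# Example: source speaker around 110 Hz, target around 200 Hz with a wider range.
print(convert_pitch([100.0, 0.0, 120.0], 110.0, 10.0, 200.0, 30.0))
```

A mean-only modification would correspond to `scale = 1`; the range ratio is what lets a lively target speaker's wider pitch excursions be reproduced from a flatter source.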

Similar resources

Context labels based on "bunsetsu" for HMM-based speech synthesis of Japanese

A new set of context labels was developed for HMM-based speech synthesis of Japanese. The conventional labels include those directly related to sentence length, such as number of “mora” and order of breath group in a sentence. When reading a sentence, it is unlikely that we count its total length before utterance. Also a set of increased number of labels is required to handle sentences with var...

Investigation of Frame Alignments for GMM-based Text-prompted Speaker Verification

Frame alignment plays an important role in GMM-based speaker verification. In text-prompted speaker verification, it is common practice to use the transcriptions to align speech frames to phonetic units. In this paper, we compare the performance of alignments from hidden Markov model (HMM) and deep neural network (DNN), using the same training data and phonetic units. We incorporate a pho...

Non-Native Text-to-Speech Preserving Speaker Individuality Based on Partial Correction of Prosodic and Phonetic Characteristics

This paper presents a novel non-native speech synthesis technique that preserves the individuality of a non-native speaker. Crosslingual speech synthesis based on voice conversion or Hidden Markov Model (HMM)-based speech synthesis is a technique to synthesize foreign language speech using a target speaker’s natural speech uttered in his/her mother tongue. Although the technique holds promise t...

HMM-based speech synthesis with various degrees of articulation: A perceptual study

HMM-based speech synthesis is very convenient for creating a synthesizer whose speaker characteristics and speaking styles can be easily modified. This can be obtained by adapting a source speaker’s model to a target speaker’s model, using intra-speaker voice adaptation techniques. In this article, we focus on high-quality HMM-based speech synthesis integrating various degrees of articulation, ...

MeLos: Analysis and Modelling of Speech Prosody and Speaking Style

This thesis addresses the issue of modelling speech prosody for speech synthesis, and presents MeLos: a complete system for the analysis and modelling of speech prosody “the music of speech”. Research into the analysis and modelling of speech prosody has increased dramatically in recent decades, and speech prosody has emerged as a crucial concern for speech synthesis. The issue of speech prosod...

Publication date: 1998